Overview


Introduction to clock synchronization protocols.

A schematic formulation of clock 
synchronization (Schneider). 

The Interactive Convergence Algorithm 
(Lamport/Melliar-Smith).

Verification of Schneider’s formulation 
(Shankar). 

Verification of Interactive Convergence 
(Rushby/von Henke). 

A hardware-oriented clock synchronization 
protocol (Infis/Moore). 

Verification of Infis/Moore’s protocol 
(Rushby/Shankar).

The EHDM Specification/Verification 
Environment. 


Conclusions. 
Main Observations


Fault-tolerant clock synchronization is a 
critical component of a real-time control 
system. 

Proofs of the correctness of clock 
synchronization are complex and subtle. 

Informal proofs tend to be tenuous in these 
domains. 

Formal verification is a useful way to reduce 
errors and achieve reliable designs. 

Specification/Verification could contribute to
the scientific foundations of reliable
engineering.


Fault-tolerant systems 


• Critical real-time control systems such as 
“fly-by-wire” digital avionics. 

• Replicated processors are used to provide 
hardware fault-tolerance. 

• Results are periodically voted. 

• Clocks must be synchronized to ensure 
approximately synchronous behaviour across 
nonfaulty processors. 
Clock Synchronization


• Clocks start synchronized. 

• Over time, the clocks drift apart. 

• The clocks are periodically synchronized by 

o an exchange of clock values 

o computation of a mutually agreeable 
clock value 

o adjustment of the logical clock 
Byzantine Clocks


Three clocks A, B, C. 

Suppose clocks drift away from real time by up to
a minute an hour.

C is faulty. 

Clocks resynchronize around noon and exchange 
clock values. 

A reads 12:00 and B reads 11:59.

A transmits 12:00 to B and C.

B transmits 11:59 to A and C.

C maliciously transmits 12:01 to A and 11:58 to
B.
Byzantine Clocks


Three clocks A, B, C. 

Clocks drift from real time by up to a minute an
hour.

C is faulty. 

Clocks resynchronize around noon and exchange 
clock values. 

A reads 12:00 and B reads 11:59.

A resets its clock to the mean of the acceptable
clock values (12:00, 11:59, and 12:01), i.e., 12:00.

B similarly resets itself to 11:59, the mean of
12:00, 11:59, and 11:58.

A and B are not any closer following
resynchronization.
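
The three-clock scenario can be replayed in a few lines. This is only an illustrative sketch: times are held as minutes since midnight, and the helper name `resync` is an assumption made for the example, not part of any protocol discussed here.

```python
from statistics import mean

def resync(readings):
    """Reset to the mean of the clock values this processor sees (minutes)."""
    return mean(readings)

NOON = 12 * 60  # minutes since midnight

# Values each nonfaulty clock sees: A's, B's, and whatever C chose to send it.
seen_by_a = [NOON, NOON - 1, NOON + 1]   # C tells A 12:01
seen_by_b = [NOON, NOON - 1, NOON - 2]   # C tells B 11:58

new_a = resync(seen_by_a)
new_b = resync(seen_by_b)
skew_before = abs(NOON - (NOON - 1))
skew_after = abs(new_a - new_b)
print(new_a, new_b, skew_before, skew_after)
```

The skew after the round equals the skew before it, which is the point of the slide: a single two-faced clock defeats naive averaging among three clocks.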
Clock Generalities


No global clock: it would be a single point of
failure, and therefore not fault-tolerant.

Synchronization is with respect to other clocks,
not real time, though protocols that synchronize
to real time do exist.

Clocks drift at rate $\rho$ with respect to real time.

Period of drift $R$ between resynchronization
rounds.

$\varepsilon$ bounds the error in reading clock values.

To keep clocks synchronized to within $\delta$, clocks
should be within $\delta_S$ following resynchronization,
and

$\delta \ge \delta_S + 2\rho R$

Each clock uses the same convergence function
to synchronize to within $\delta_S$.
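
A quick calculation shows how small the drift term is in practice. The sketch below takes the period (104.8 msec) and drift rate (15 x 10^-6) quoted from Rushby/von Henke; the post-resynchronization skew `DELTA_S` is an assumed value chosen only for illustration, and the function name is hypothetical.

```python
def min_skew_bound(delta_s, rho, period):
    """Smallest delta allowed by: delta >= delta_s + 2 * rho * period."""
    return delta_s + 2 * rho * period

RHO = 15e-6        # drift rate (Rushby/von Henke figure)
R = 104.8e-3       # resynchronization period in seconds (same source)
DELTA_S = 132e-6   # ASSUMED skew just after resynchronization, in seconds

drift_term = 2 * RHO * R
print(f"drift between rounds: {drift_term * 1e6:.3f} usec")
print(f"delta must be at least {min_skew_bound(DELTA_S, RHO, R) * 1e6:.3f} usec")
```

With these figures two clocks drift apart by only a few microseconds per round, so the achievable skew is dominated by how tightly the convergence function can resynchronize.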
Typical numbers (from Rushby/von Henke)


Parameter        Value                  Explanation

$N$              6                      No. of clocks
$R$              104.8 msec             Period
$\delta_0$       132 µsec               Initial skew
$\varepsilon$    66.1 µsec              Reading error
$\rho$           $15 \times 10^{-6}$    Drift rate
$\delta$         271 µsec ($F = 1$)     Maximum skew
Clock Requirements

• R1: At any instant, two nonfaulty clock
readings should be no further than $\delta$ apart.

• R2: There should be a small bound on the
adjustment needed to resynchronize a clock.
Schneider’s Schema


A generalization of various protocols consisting 
of: 


• Assumptions on the behavior of nonfaulty 
physical clocks. 

• Constraints on the computation of nonfaulty 
logical clocks. 

These assumptions and constraints are used to
derive a bound on the skew between two
nonfaulty logical clocks, i.e.

$|LC_p(t) - LC_q(t)| \le \delta$
Physical Clock Assumptions

$N$ clocks with at most $F$ faulty.

$t_p^i$ is the time at which $p$ resets its clock for the
$i$’th time.


Interval between resets is bounded:

$r_{min} \le t_p^{i+1} - t_p^i \le r_{max}$

Skew between resets is bounded: $|t_p^i - t_q^i| \le \beta$

Bounded drift rate w.r.t. real time: for $s \ge t$,
$(s - t)(1 - \rho) \le C_p(s) - C_p(t) \le (s - t)(1 + \rho)$
Logical Clock Assumptions


A convergence function $Cfn$ is used to compute
the adjusted logical clock.

Let $\Theta_p^i(q)$ be $p$’s reading (estimate) of $q$’s logical
clock at time $t_p^i$.

Then $LC_p(t_p^i) = Cfn(p, \Theta_p^i)$

The $i$’th adjustment to be applied to the
physical clock to derive the logical clock is

$Adj_p^i = Cfn(p, \Theta_p^i) - C_p(t_p^i)$

In general the logical clock is defined to be

$LC_p(t) = C_p(t) + Adj_p^i$
for $t_p^i \le t < t_p^{i+1}$

$\varepsilon$ bounds the error with which clocks are read.

Additionally, certain assumptions are made on the
behavior of a satisfactory convergence function.
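
The relationship between physical clock, adjustment, and logical clock can be sketched directly. The function names and the millisecond figures below are hypothetical; the two formulas are the ones on this slide.

```python
def adjustment(cfn_value, physical_at_reset):
    """Adj = Cfn(p, readings) - C_p(t_reset): the i'th adjustment."""
    return cfn_value - physical_at_reset

def logical_clock(physical_now, adj):
    """LC_p(t) = C_p(t) + Adj, valid from this reset until the next one."""
    return physical_now + adj

# Hypothetical numbers in milliseconds: the physical clock reads 1000 at the
# reset instant, while the convergence function returns 1003.
adj = adjustment(1003, 1000)
at_reset = logical_clock(1000, adj)   # equals the convergence function value
later = logical_clock(1050, adj)      # same adjustment until the next reset
print(adj, at_reset, later)
```

The physical clock itself is never altered; each round only replaces the additive adjustment.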
Translation Invariance

Adding $X$ to each clock reading adds $X$ to the
value of the convergence function.

For any $X$ and any $\theta$ mapping clock numbers to
clock readings,

$Cfn(p, (\lambda q : \theta(q) + X)) = Cfn(p, \theta) + X$

Translation invariance is used to compare the
values of convergence functions at $t_p^i$ and $t_q^i$.
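
Translation invariance is easy to check for a concrete function. The sketch below uses a plain mean as a stand-in convergence function; `mean_cfn`, `shifted`, and the readings are assumptions made for the example and are not any of the protocols discussed here.

```python
from fractions import Fraction

def mean_cfn(p, theta):
    """Stand-in convergence function: exact mean of all readings (ignores p)."""
    return Fraction(sum(theta), len(theta))

def shifted(theta, x):
    """The reading vector (lambda q: theta(q) + x)."""
    return [reading + x for reading in theta]

theta = [1000, 1004, 997, 1010]   # hypothetical readings, indexed by clock number
base = mean_cfn(0, theta)
for x in (-7, 0, 250):
    # Shifting every reading by x must shift the result by exactly x.
    assert mean_cfn(0, shifted(theta, x)) == base + x
print(base)
```

Exact rational arithmetic is used so that the equality is checked without floating-point rounding.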
Precision Enhancement

Formalizes the intuition that 

• the closer the good clocks are to each other 

• the closer the different readings of the same 
good clock 

• then the closer the resulting convergence 
function values 
Precision Enhancement (contd.)

Given any predicate $P$ on clocks $0$ to $N - 1$ that
holds of at least $N - F$ clocks.

Given $p$, $q$ such that $P(p)$ and $P(q)$.

Given $\theta_p$ and $\theta_q$ such that

• If $P(l)$ and $P(m)$, then $|\theta_p(l) - \theta_p(m)| \le Y$

• If $P(l)$ and $P(m)$, then $|\theta_q(l) - \theta_q(m)| \le Y$

• If $P(l)$, then $|\theta_p(l) - \theta_q(l)| \le X$

Then there exists a bound $\pi(X, Y)$ such that

$|Cfn(p, \theta_p) - Cfn(q, \theta_q)| \le \pi(X, Y)$

Illustrative example to follow.
Accuracy Preservation


Bounds the adjustment away from a good clock
reading.

Given any predicate $P$ on clocks $0$ to $N - 1$ that
holds of at least $N - F$ clocks.

Given that $P$ holds of $p$ and $q$.

Given $\theta_p$ such that whenever $P(l)$ and $P(m)$ for
any two clocks $l$ and $m$, then

$|\theta_p(l) - \theta_p(m)| \le Z$


Then

$|Cfn(p, \theta_p) - \theta_p(q)| \le \alpha(Z)$

That is, if the good clock readings are within $Z$,
the adjustment away from a good clock reading
is no more than $\alpha(Z)$.
The Final Result: Agreement


• A1: $\beta \le r_{min}$

Synchronization rounds are distinct.

• A2: $\delta_0 \le \delta_S$

Initial skew no greater than skew
immediately following synchronization.

• A3: $\delta_S + 2\rho r_{max} \le \delta$

Drift between synchronization rounds is
below $\delta$.

• A4: $\pi(2\varepsilon + 2\rho\beta,\ \delta_S + 2\rho(r_{max} + \beta) + 2\varepsilon) \le \delta_S$

Skew between just synchronized clocks below
$\delta_S$.

• A5: $\alpha(\delta_S + 2\rho(r_{max} + \beta) + 2\varepsilon) \le \delta$

Skew between synchronized and yet to be
synchronized clocks below $\delta$.
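
The five side conditions can be evaluated mechanically once a convergence function supplies its bounds. The sketch below reads each condition as a non-strict inequality; the functions `pi` and `alpha` passed in, and all numeric values, are hypothetical placeholders rather than figures from any verified protocol.

```python
def agreement_conditions(beta, r_min, r_max, rho, eps,
                         delta0, delta_s, delta, pi, alpha):
    """Report which of A1..A5 hold. pi(X, Y) and alpha(Z) are the precision
    enhancement and accuracy preservation bounds of the convergence function."""
    spread = delta_s + 2 * rho * (r_max + beta) + 2 * eps
    return {
        "A1": beta <= r_min,
        "A2": delta0 <= delta_s,
        "A3": delta_s + 2 * rho * r_max <= delta,
        "A4": pi(2 * eps + 2 * rho * beta, spread) <= delta_s,
        "A5": alpha(spread) <= delta,
    }

# Hypothetical bounds and parameters (arbitrary time units), for illustration.
example_pi = lambda x, y: x + y / 2
example_alpha = lambda z: z

ok = agreement_conditions(beta=10, r_min=1000, r_max=1100, rho=1e-5, eps=1,
                          delta0=5, delta_s=10, delta=20,
                          pi=example_pi, alpha=example_alpha)
tight = agreement_conditions(beta=10, r_min=1000, r_max=1100, rho=1e-5, eps=1,
                             delta0=5, delta_s=5, delta=20,
                             pi=example_pi, alpha=example_alpha)
print(ok)
print(tight)
```

In the second call only A4 fails: asking for a tighter post-synchronization skew than the convergence function can deliver breaks the induction.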
Conclusion:


$t \ge 0 \land correct(p, t) \land correct(q, t) \Rightarrow |LC(p, t) - LC(q, t)| \le \delta$

Skew between nonfaulty logical clocks
bounded by $\delta$.



Verification of Schneider’s Schema using
EHDM

Proof consists of: 

• 30 axioms involving multiplication, division, 
and clocks. 

• 12 definitions 


• 95 lemmas. 


Proof took about two man-months using EHDM. 

Machine verification takes 1000 to 3500 CPU 
secs on SUNs. 

Numerous inaccuracies in Schneider’s original 
presentation were corrected. 

The machine proof adds enormous clarity to 
Schneider’s insightful, but imprecise descriptions 
and definitions. 


Instantiation of Schneider’s schema in progress.
Lamport/Melliar-Smith’s Interactive
Convergence (ICA)

$3F + 1$ clocks needed to tolerate $F$ Byzantine
faults.

$p$ records (relative discrepancies of) other clock
values when its clock reads $iR$.

“Ignores” clock readings further than $\Delta$ away.

Adjusts its clock by the ‘egocentric’ mean of the
acceptable clock differences.
Instantiating Schneider’s protocol with ICA


Convergence function:

$ica(p, \theta) = \frac{1}{N} \sum_{l=0}^{N-1} fix_p(\theta(l), \theta)$

where

$fix_p(x, \theta) = x$ if $|x - \theta(p)| \le \Delta$, and $\theta(p)$ otherwise

Translation Invariance: Note that

$fix_p(\theta(l) + X, (\lambda q : \theta(q) + X)) = fix_p(\theta(l), \theta) + X$
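
The egocentric mean is short enough to write out. This is a minimal sketch under the reading that a value further than the threshold from p's own reading is replaced by p's own reading; the names `fix`, `ica`, `big_delta`, and the sample readings are chosen for the example.

```python
from fractions import Fraction

def fix(p, x, theta, big_delta):
    """Keep x if it lies within big_delta of p's own reading, else use p's own."""
    return x if abs(x - theta[p]) <= big_delta else theta[p]

def ica(p, theta, big_delta):
    """Egocentric mean: average of the fixed readings over all N clocks."""
    n = len(theta)
    return Fraction(sum(fix(p, theta[l], theta, big_delta) for l in range(n)), n)

theta = [100, 103, 98, 500]   # hypothetical readings; clock 3 is wildly off
value = ica(0, theta, 10)     # clock 3's reading is replaced by clock 0's own
shifted_value = ica(0, [r + 40 for r in theta], 10)
print(value, shifted_value - value)
```

Because the rejection test depends only on differences between readings, shifting every reading by the same amount shifts the result by that amount, which is the translation invariance property.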
Precision Enhancement of ICA


Given that for all correct $l$, $m$

• $|\theta_p(l) - \theta_q(l)| \le X$

• $|\theta_p(l) - \theta_p(m)| \le Y$

• $|\theta_q(l) - \theta_q(m)| \le Y$

We have

$|ica(p, \theta_p) - ica(q, \theta_q)| \le X + \frac{FY + 2F\Delta}{N} = \pi(X, Y)$

$X$ is negligible, but $Y \approx \Delta$, so

$\pi(X, Y) \approx \frac{3F\Delta}{N}$

Since $\Delta > \delta + \varepsilon$ and $\pi(X, Y)$ must stay below $\delta$,
we need $3F/N < 1$, so we get $N \ge 3F + 1$.



Accuracy Preservation of ICA

If nonfaulty clock readings are $Z$ apart, then $F$
faulty clocks can contribute a further skew of
$F\Delta/N$ to the egocentric mean.

So

$\alpha(Z) \le Z + \frac{F\Delta}{N}$
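
The effect of the clock count on these bounds can be tabulated. The sketch below encodes the two ICA bounds as read from these slides, X + (FY + 2FΔ)/N and Z + FΔ/N; the function names and the threshold value are assumptions for illustration.

```python
from fractions import Fraction

def pi_ica(x, y, f, n, big_delta):
    """Precision enhancement bound: X + (F*Y + 2*F*Delta) / N."""
    return x + Fraction(f * y + 2 * f * big_delta, n)

def alpha_ica(z, f, n, big_delta):
    """Accuracy preservation bound: Z + F*Delta / N."""
    return z + Fraction(f * big_delta, n)

def min_clocks(f):
    """Smallest N for which 3F/N < 1."""
    return 3 * f + 1

F, DELTA = 1, 100   # hypothetical fault count and rejection threshold
for n in (3, 4, 6):
    bound = pi_ica(0, DELTA, F, n, DELTA)   # X negligible, Y taken equal to Delta
    print(n, bound, bound < DELTA)
```

With one fault, three clocks give a bound equal to the threshold itself, so no contraction is possible; from four clocks onward the bound drops below it.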
Rushby/von Henke’s verification of ICA
using EHDM

Around 1-2 man-months of effort
20 modules

1,550 lines of specification
166 proofs

1 hour elapsed to prove them all on Sun 3/75-8

Verification revealed several minor flaws in a
five-year-old journal proof.
Flaws in Lamport/Melliar-Smith

Main induction incorrect (bad approximations) 

Proof of Lemma 4 incorrect (bad 
approximations); also typographical error in 
statement 

Lemma 1 false in absence of additional 
constraints in A2 

Lemma 2 similarly, also typographical error in 
statement 

Lemma 3 similarly, and unnecessarily general 

Missing requirement for S2 in Lemmas 1, 3, 4, 
and (when repaired) 2 
Original Constraints on parameters


C1:

C2:

C3: $\Sigma \approx \Delta$
C4: $\Delta \approx \delta + \varepsilon$
C5: $\delta \approx \delta_0 + \rho R$

C6: $\delta \ge 2(\varepsilon + \rho S) + \frac{2m\Delta}{n - m} + \frac{n\rho R}{n - m}$
New Constraints on parameters


C1: $R \ge 3S$

C2: $S \ge \Sigma$
C3: $\Sigma \ge \Delta$

C4: $\Delta \ge \delta + \varepsilon + \frac{\rho}{2} S$

C5: $\delta \ge \delta_0 + \rho R$

C6: $\delta \ge 2(\varepsilon + \rho S) + \frac{2m\Delta}{n - m} + \frac{n\rho R}{n - m} + \frac{n\rho\Sigma}{n - m} + \rho\Delta$
Infis/Moore’s economic approach

Tolerates $F < N/2$ omission failures for $N$ clocks.

At clock reading $iR$, $p$ broadcasts a pulse on its
private line.

Say $p$ receives and validates $N - f$ pulses.

The $(N - F)$’th pulse is bounded from above and below
by a good pulse.

Ditto for the $(F - f + 1)$’th pulse.

$p$ starts its new clock at the earlier of pulse $N - F$ with
delay $D$, or pulse $F - f + 1$ with delay $2D$.

Skew $\delta_S \approx D$, and $\delta < 2D$.

Verification nearly complete using EHDM.
Elaborates significantly on informal proof.
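
The start-time rule can be written as a small function. This is a literal reading of the slide, not the hardware design: it assumes the validated pulse arrival times are available as a list, that pulses are counted in arrival order starting from 1, and that the names `start_time`, `big_f`, and `d` are free choices for the example.

```python
def start_time(pulse_times, n, big_f, d):
    """Earlier of: pulse N-F plus delay D, or pulse F-f+1 plus delay 2D,
    where f = N - len(pulse_times) pulses were missing or rejected."""
    pulses = sorted(pulse_times)
    f = n - len(pulses)
    if not 0 <= f <= big_f or 2 * big_f >= n:
        raise ValueError("need 0 <= f <= F and F < N/2")
    late = pulses[n - big_f - 1]   # the (N-F)'th pulse, counting from 1
    early = pulses[big_f - f]      # the (F-f+1)'th pulse, counting from 1
    return min(late + d, early + 2 * d)

# Hypothetical arrival times with N = 5 clocks, F = 2, D = 10, one pulse missing.
close_together = start_time([0, 2, 3, 4], n=5, big_f=2, d=10)
late_stragglers = start_time([0, 1, 50, 51], n=5, big_f=2, d=10)
print(close_together, late_stragglers)
```

When the pulses arrive close together the first alternative fires; when the later pulses straggle, the second alternative caps how long the processor waits.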
Schemata for Infis/Moore’s protocol


[Figure: timing diagram of pulses]


Extract from Infis/Moore

[Scanned extract; the subscripts on the pulse times are not legible.
The extract compares the pulse times $T^*$ validated by a processor
with the full set of pulse times $T$, using the facts that the $T^*$
are a subset of the $T$ and that at least one of the relevant pulses
must come from a processor which is actually fault-free (and
synchronised). From these inequalities it brackets the start time $W$
between two minima of pulse times, one of them offset by $d$ (eqn. 1).
The validity tests then bound the relevant spread by $2d$, so one of
two differences (or both) is below $d$; in either case eqn. 1 implies
that $W$ has a range of at most $d$.]



Verification of Infis/Moore’s protocol 

Formalization is fairly close to hardware 
realization. 

Main induction over synchronization rounds 
completed, as well as all of the important 
lemmas. 

Machine proof is remarkably involved and 
complex. 

Proof took two man-months of effort and covers 
about 70 dense pages. 
Common Errors


Ignoring failures. 

Failing to distinguish real time from clock time,
and relative from absolute measurements.

Ignoring small but significant quantities.

Proving one statement but using another. 

Imprecise definitions. 

Erroneous algebraic manipulations. 

Implicit assumptions. 

Incorrect assumptions. 
Difficulties in verification


Dealing simultaneously with failures, temporal 
ordering, relative measurements, drift. 

Have to be careful not to assume anything 
about failed clocks. 

“Circular definitions” need to be avoided.
E.g., a round ends when various events have
taken place.

Various events take place as scheduled if the
clock is correct at the end of the round.

Mentally retaining all the relevant facts is 
difficult. 





EHDM specification/verification system 

Based on a simply typed higher-order logic with 
subtyping. 


Parametric modules used to structure 
specifications. 

Specifications can be proved to implement other 
specifications. 

Components include parser, typechecker,
theorem prover, Hoare sentence prover, and
MLS tool.

Theorem prover contains powerful decision 
procedures for integer and rational inequalities. 

New implementation should be ready by end of 
1990. 
Concluding Observations

Reasoning about fault-tolerant clock 
synchronization is extremely difficult. 

Proofs involve heavy use of inequalities, algebraic 
manipulations, finite set theory, and induction. 

Protocol designers themselves feel the need for 
mechanized verification tools. 

Benefits of such tools are: 

• Design discipline 

• Efficient location/correction of design errors 

• Design library for future reuse 

• Standardized language for communicating 
designs and proofs 

Specification and verification technology could 
contribute effectively to the foundations of 
reliable engineering. 